
Analyzing a Simple C/C++ Application with VTune and Microsoft Visual C++*

This step-by-step walkthrough shows the highlights of using VTune with a C/C++ application. The user is encouraged to explore the VTune screens (for example, context-sensitive help and display options) while going through these instructions.

These instructions reflect working with C/C++ programs using Microsoft’s development environment. Instructions for other languages and environments will be posted at the following URL as new support is defined:
http://developer.intel.com/design/perftool/vtune/prod_rev.htm

These instructions require that VTune 2.5 and Microsoft Visual C++ 5.0 are installed.

Create and Compile the XForm Application

1. Invoke Visual C++ with no active project
2. Click File->New, select Files->Text, click OK. A blank text window appears.
3. Type in the following program:

#include <windows.h>
#include <stdio.h>
#include <winbase.h>
#include <time.h>

int main(int argc, char *argv[])
{
  int i, j;
  double dx[3000], dy[3000];
  clock_t startTime, endTime;

  for (i=0; i<3000; i++)
    dy[i] = dx[i] = i;

  startTime = clock();
  for (i=0; i<3000; i++)
    for (j=0; j<9000; j++)
      dy[i] += dx[i]/0.45;
  endTime = clock();

  printf ("\n\nProgram Completed.\n\n");
  printf ("Execution time in seconds: %f\n\n",
    (double) (endTime - startTime) / CLOCKS_PER_SEC);
  Sleep (3000);
}


4. Click File->SaveAs, type Xform.cpp as the filename, and choose an appropriate directory (for example, c:\Xform).
5. Click Build->Build Xform.exe. A dialog box asks whether to create a default workspace. Click Yes. The Xform application is then built. By default, Debug is turned on. VTune needs Debug information to display source code.
6. After it compiles correctly, click Build->Execute. The Xform application runs for several seconds, and then ends.

Find Xform’s Hotspot with VTune 2.5

1. Invoke VTune 2.5. If the “VTune Assistant” window appears, close it by clearing the “Show on Startup” radio button and clicking the red X. It can be re-invoked with View->Assistant.
2. Click File->NewProject. The NewProject wizard will ask some questions:
    Program to test: Xform.exe with path, probably c:\Xform\Debug\Xform.exe. Click Next.
    Working directory: Directory that contains Xform.exe. Click Next.
    Command line parameters, Beginning and ending keystrokes: Leave Blank. Click Next.
    How long for application to run: 20 seconds. Click Next.
    Source Code Dir: Dir that contains Xform.cpp, probably c:\Xform. Click Next.
    VTune Output Dir: Dir that contains Xform.exe. Click Finish.
3. Click View->ProjectOptions, click the Automation tab, click “Terminate Program When Monitoring Session ends”, then click Close.
4. Click the Run->StartMonitorSession command. This runs the Xform program.
5. After several seconds the Xform window closes. If VTune is still sampling at that point (“Session starting...” appears in the VTune main menu window), click the Run->EndMonitorSession command. Several progress meters appear. VTune asks whether to open the session. Click Yes.
6. The modules report appears. Maximize the window. This shows a system-wide view of the software modules that executed in the system.
7. The modules are listed alphabetically from the top. Double click the Xform.exe line. The HotSpot window appears.
8. Double click on the longest red line, which is the hotspot.
9. The analysis window appears with the hotspot’s source code displayed. Note the line of source with the biggest “time” number to the left of the line. This is the source code for Xform’s hotspot.

Get Advice on Speeding Up Xform

1. Double click on the hotspot source line, dy[i] += dx[i]/0.45, to invoke VTune’s C/C++ coach. The coach asks for source file information for Xform.cpp. Click “Manual Entry”, then OK. The coach indicates that no source options were specified. Click OK. The coach then runs for a short time and reports that both dx and dy could be moved out of the inner loop. Click the question marks next to the advice for more detail.
    Close the C/C++ coach window.
    At this point you may want to make the suggested changes to the source code of Xform, recompile, and see how much it improves.
2. Click the View->MixAssemblerAndSource command. Maximize the window.
    Assembly language code for the program is displayed. It is annotated with CPU performance data for the instructions. Included are Pentium® processor pairing information (color coded on the left), Pentium processor clock cycle counts, performance penalties, and CPU usage percentages from the profiling done earlier. Other data (Pentium II processor micro-ops, decoder groups, etc.) may be requested using the Options->ColumnDisplayOptions command.
3. VTune also gives instruction-level advice. Double click the line that has the penalty “FP_Dep_ST(0)” on the right to learn more about ways performance can be improved. Then close the Advanced Instruction Analyzer window.
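
The coach's advice from step 1 (moving the dx and dy accesses out of the inner loop) could be applied along the following lines. This is only a sketch: the function names accumulateOriginal and accumulateHoisted are hypothetical and do not appear in Xform.cpp, and the exact change the coach proposes may differ. It relies on the observation that dx[i]/0.45 does not depend on j.

```cpp
// Inner loop as written in Xform.cpp: dx[i] is loaded and divided, and
// dy[i] is read and written, on every pass of the j loop.
void accumulateOriginal(double *dy, const double *dx, int n, int repeats)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < repeats; j++)
            dy[i] += dx[i] / 0.45;
}

// Hypothetical rewrite: the loop-invariant quotient is computed once per i,
// and the running sum is kept in a local variable until the j loop ends.
void accumulateHoisted(double *dy, const double *dx, int n, int repeats)
{
    for (int i = 0; i < n; i++) {
        const double step = dx[i] / 0.45;  // does not depend on j
        double sum = dy[i];                // dy[i] is read once
        for (int j = 0; j < repeats; j++)
            sum += step;
        dy[i] = sum;                       // dy[i] is written once
    }
}
```

In Xform.cpp the equivalent change would replace the body of the timed double loop, with n = 3000 and repeats = 9000. Re-running the monitoring session afterwards shows whether the hotspot's share of time actually changed; an optimizing build may already perform part of this transformation.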

Invoke Dynamic Analysis

1. Dynamic Analysis is for engineers who want very detailed information about dynamic processor performance issues.
2. Click View->Source.
3. Click on the first “for (i=0; i<3000; i++)”, near or on line 12, then click DynamicAnalysis->SetEntryPoint.
4. Click on the second “for (i=0; i<3000; i++)”, near or on line 16, then click DynamicAnalysis->SetExitPoint. This defines the part of the program that will be simulated for dynamic performance penalties (for example, cache performance issues and branch mispredictions).
5. Click DynamicAnalysis->InvokeDynamicAnalyzer. The Dynamic Analysis setup box appears. Click Start. The program will run for several seconds, followed by a box appearing indicating the number of instructions and clocks simulated. Click OK.
6. The Dynamic Analysis screen appears with instruction level details on performance issues. In the “Penalties and Warnings” column, the data is formatted as follows:
    Occurrences * Penalty : ClocksPerPenalty.
    For example, 750 * DCache_Comp_Miss : 3 indicates 750 data cache compulsory misses (a miss the first time an address is referenced), with each miss causing 3 clocks of penalty, for a total of 2250 clocks.
7. Double click on any of the instructions. Detailed descriptions of the penalties are displayed.
8. The Count column indicates the number of times each instruction is executed. The Clocks column displays the percentage of CPU time that the instruction consumed.


* Legal Information © 1998 Intel Corporation